Psychological Monographs:
General and Applied


Development and Application of Tests for 
University Students in Norway: A Report on 
Parts of a Research Project 


By 


Øyvind Skard, Inger Marie Aursand, and
Leif J. Braaten 


University of Oslo




Price $1.00 


Edited by Herbert S. Conrad 
Published by The American Psychological Association, Inc. 


No. 383 
1954 
Vol. 68 
No. 12




COPYRIGHT, 1954, BY THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC.


Vol. 68, No. 12 


Psychological Monographs: General and Applied 


Whole No. 383, 1954 


Development and Application of Tests for University Students 


in Norway: A Report on Parts of a Research Project 
Øyvind Skard, Inger Marie Aursand, and Leif J. Braaten


The work of student guidance and
counseling in Norway has thus far 
been done only by nonpsychological 
agencies, and has consisted mainly of 
amateur guidance to provide help on 
scholarships, study loans, housing, em- 
ployment, etc. 

Since, especially after the last war, the 
number of students going to the univer- 
sity increased and competition for jobs 
became much more severe, and since only 
a limited number of students were ac- 
cepted for certain fields of study, the pres- 
sure on the individual student made itself 
more keenly felt. This has brought to 
the fore the need for professional psycho- 
logical assistance in student personnel 
work in Norway. The desirability of a 
psychological student guidance and coun- 
seling center is today generally admitted 
in Norway. 

The present report describes the initial 
research designed to lay a good scientific 
foundation for such an agency. When the 
research project reported here was 
started, no tests for students existed in 
Norway. We felt that much of the work 
done in the field, especially in the United 
States, might be of great help. For this 
reason most of our work so far has con- 
sisted in adapting for use in Norway some 
of the better known tests for American 
students. The project has been developed 
in close contact with the Institute of 
Psychology at the University of Oslo, 
and many of the psychology students 
there have taken part in the research
work and received training in psychometric
research techniques.

The project has been made possible 
primarily by a large donation from The 
Grant Foundation, Inc., but the Univer- 
sity of Oslo and The Norwegian General 
Research Council have also contributed 
financially during the last stages of the 
project. The present report covers a 
period of about four years. 

The most important single project has 
been to adapt the Yale Educational Apti- 
tude test battery for use in Norway. This 
project will be given the most space in 
the present report. But the report will 
also cover our work with the American 
Council on Education Psychological Ex- 
amination for College Freshmen and with 
different interest schedules. 

Other tests have also been developed 
and tried out as part of the whole project. 
A medical aptitude test has been devel- 
oped and used for three years; a test for 
language usage and comprehension has 
been developed and tried out; a person-
ality test has been tried out; etc. But not
all these projects have yet reached a stage 
where they can be fully reported. 

We want to express our gratitude to 
the American Council on Education, The 
Psychological Corporation, and to Drs. 
A. B. Crawford, P. S. Burnham, L. L. 
Thurstone, E. K. Strong, G. F. Kuder, 
and Hugh Bell for their generosity in 
letting us use their tests and material 
freely. Without their help the research 
project would still be in its initial stage.




I. EXPERIMENTS WITH A GENERAL INTELLIGENCE TEST 


With the generous permission of the 
American Council on Education we used 
the ACE Psychological Examination for 
College Freshmen to develop a test of 
general scholastic ability. The test used
in our experiments consists of a transla- 
tion of the quantitative part and an 
adaptation of the verbal part of the ACE. 
In the verbal part, we had to make con- 
siderable changes and construct new 
items that were better suited for use with 
Norwegian students. The Norwegian 
form of the test was tried out in all the 
usual ways and revised according to the 
results obtained. We had some difficulties 
in the placing of items according to 
difficulty. When time limits were in- 
creased above a certain limit, some of the 
items lost all their discriminating power. 
We also found that when an item was 
moved forward or backward in a subtest 
because of its degree of difficulty, the 
change of position altered its degree of 
difficulty so much as to indicate the 
existence of a fairly strong time factor 
operating in most of the original subtests. 
We felt that for use with our students we 
would have preferred a test in which the 
frequency of solution of the different 
items is a function of the difficulty of the 
items themselves, and not so much a 
function of the order of the items in the 
various subtests. The items may be too 
easy for our students because the age of 
Norwegian students in the last grade of 
high school is about the same as that of 
an American college sophomore. We 
decided nevertheless to try out the test. 
Our reliabilities were satisfactory, the 
corrected odd-even reliability being +.94. 


Results with Different Groups of Students 


a. Group means. The means for the 
different groups of students vary con- 


siderably. A number of quite interesting 
mean scores are given in Table 1. 

The results given in Table 1 are based 
on rather small groups, but they cor- 
respond with results obtained later with 
other tests. The high school language 
line is usually regarded as easier than the 
science and mathematics line. Students 
following the science and mathematics 
line have to take much more advanced 
mathematics and physics than do students 
following the language line. The most 
important of the university studies that 
accept only a limited number of students 
(technology, medicine) state among their 
requirements that the students should 
have matriculated from the science and 
mathematics line in high school. For the 
rest of the University studies for which 
admission is restricted (pharmacy, dentis-
try, agriculture, veterinary medicine),
matriculation from the science and math- 
ematics line is regarded as the best quali- 
fication. This is one of the main reasons 
why most of the gifted students choose 
the science and mathematics line in high 
school.

TABLE 1
MEAN SCORES OF DIFFERENT GROUPS
(American Council on Education Psychological
Examination for College Freshmen)

Group                                                      N      Mean
High school, language line, fifth grade                    87     87.7
University freshmen, liberal arts                         124     93.9
Teachers College, nonstudent classes                   [illeg.]   95.9
High school, science and mathematics line, fifth grade     55    105.5
University freshmen, natural science                      113    105.5
Teachers College, student classes                         121    110.8

SD values legible in the source (rows not identifiable): 20.9, 20.3, 20.9, 17.2.

University freshmen studying
liberal arts have almost all matriculated
from the language line in high school.
Their mean test score is above the high 
school liberal arts students, but below the 
high school science and mathematics 
students. The mean of the Teachers 
College nonstudent classes is above the 
mean of the liberal arts university stu- 
dents. This is probably due to the rather 
severe selection procedure in the Teach- 
ers College. The high school science and 
mathematics students and the science 
and mathematics university students have 
about the same mean score. This is quite 
reasonable, because there is no selection 
procedure for the science and mathe- 
matics studies at the university—and most 
of the students graduating from high 
school science and mathematics line con- 
tinue their studies at the university. The 
student classes at the Teachers College
proved to have the highest mean
score of all the groups. This was to be
expected, because these students are 
selected from among the best high school 
graduates. 

b. Predictive value of the test. Because 
of the great influence of speed in the test 
results, we did not expect the test to 
show any very valuable correlations with
grades in high school or university
courses.

The studies of the predictive value of 
the test are based on rather small groups 
of students. The correlation between 
total score on the test and high school 
grade average was found to be +.52 on 
the science and mathematics line and 
only +.23 on the language line. (The
groups consisted only of 55 and 88 stu- 
dents, respectively.) It is difficult to ex- 
plain the great difference between these 
two coefficients. It may be due partly to 
the fact that grades on the science and 
mathematics line are presumably more 
reliable than grades on the language 
line. 


Most of our validation studies were 
done at the Teachers College. Students 
may enter the Teachers College in Nor- 
way after matriculation from high school
and finish their studies in two years. 
These students are selected mainly on the 
basis of high school grades, and only 
students with very good grades are ad- 
mitted. For those who have not finished 
high school, there is a four-year study 
program at the Teachers College. These
students are selected on the basis of a
series of oral and written entrance ex-
aminations.

For a group of 121 students in the two-
year program we found a correlation
of +.46 between total test scores and 
final grades. This is of course not very 
high, but when one considers how care- 
fully selected this student group is, it is 
rather surprising that the correlation co- 
efficient is as high as it is. 

For 102 students in the four-year course
we obtained a correlation of +.53 be-
tween test result and average grades; it
varied for different classes from +.18 to
+.71.

For a proper evaluation of these results 
one ought to have the correlation be- 
tween the ordinary or traditional selec- 
tion factors and final grades from Teach-
ers College. These correlations have not
been calculated. But we can use similar 
studies from other countries to evaluate 
our results. Stuit, Dickson, Jordan, and 
Schloerb (2) give an average correlation 
of +.38 between high school grades and 
four-year average grades in teacher train- 
ing school. The correlation is somewhat 
higher with grades after only one or two 
years of work. They say: 

1. The median coefficient of correlation of
+.51 indicates that a significant degree of rela-
tionship obtains between the quality of high
school work and performance in the first year
of a teacher-training course.
2. The quality of high school work will prob-
ably have less predictive value in forecasting how
well an individual will perform on the higher
level, that is, beyond the first year, judging from
the obtained median coefficient of correlation
of +.38 (pp. 142-148).

There was good reason to expect that 
the correlation between high school 
grades and final grades at Teachers Col- 
lege for our student group would be even 
lower than the correlations reported by 
Stuit, et al. Our Teachers College stu- 
dents are all selected on the basis of their 
high school grades, and hence the varia- 
tion in grades is very restricted. For the 
medical students, who are selected on a 
similar basis, we have found correlations 
varying from —.16 to +.23 between high 
school grades and final grades. Compared 
to these correlations and the results 
quoted from Stuit, et al., our correlations 


of +.46 for the student group and +.53 
for the nonstudent group are rather high. 
They can be supposed to have predictive 
value as high as, or higher than, those 
measures now used as the basis of selec- 
tion. 
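The argument above rests on the fact that selecting students on a measure narrows its spread and thereby lowers its correlation with later grades. A minimal sketch of this effect follows. It uses the standard formula for direct selection on the predictor and assumes linear regression and equal error variance throughout the range; these are assumptions of the illustration, not something stated in the report. The function name and the numerical values are hypothetical.

```python
from math import sqrt

def restricted_r(r_full, sd_ratio):
    """Correlation expected in a selected group.

    r_full   -- correlation in the unselected group
    sd_ratio -- SD of the predictor in the selected group divided by
                its SD in the unselected group (between 0 and 1)
    """
    numerator = r_full * sd_ratio
    return numerator / sqrt(1 - r_full ** 2 + numerator ** 2)

# Hypothetical illustration: a correlation of .50 among all applicants,
# with selection cutting the predictor's SD to half, three quarters,
# or leaving it unchanged.
for ratio in (1.0, 0.75, 0.5):
    print(ratio, round(restricted_r(0.50, ratio), 3))
```

On these assumed figures, halving the standard deviation reduces a correlation of .50 to roughly .28, which is the kind of shrinkage to be expected when high school grades are both the selection basis and the predictor.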


Conclusion 


We have at present two parallel forms
of the general intelligence test in Nor-
wegian. In spite of the test's deficiencies,
which consist mainly of its being rather
too much influenced by speed, the test has
proved to have predictive value as high 
as, or higher than, the measures used for 
selection in the Teachers College. The 
test also differentiates in a clear and con- 
sistent way between groups of students of 
different levels of intellectual ability. 


II. A DIFFERENTIAL APTITUDE TEST BATTERY 


In 1949 and 1950 a Norwegian edition 
of the Yale Educational Aptitude test 
battery (1) was tried out on different 
groups of students in Norway. This 
battery was originally developed at Yale 
University by Albert B. Crawford and 
Paul S. Burnham. It consists of seven 
tests: test I is a test of verbal comprehen- 
sion, test II an artificial language test, 
test III a test of verbal reasoning, test IV 
of quantitative reasoning, test V of 
mathematical aptitude, test VI of spatial 
visualizing, and test VII of mechanical 
ingenuity. Tests I, II, and III are sup- 
posed to indicate aptitude for liberal arts 
studies; tests III, IV, and V for pure 
science and mathematics, and tests V, VI, 
and VII for technological studies. The
battery represents something between 
pure ability testing and knowledge test- 
ing. It involves knowledge which all who 
have gone to high school may be assumed 
to have acquired, and requires use of this 


knowledge to solve new problems. 

Purpose. The purpose of our project 
with the Yale test battery was twofold. 
We hoped to make an adaptation of the 
battery that would be usable for selection 
and guidance work with Norwegian stu- 
dents. We also hoped that the test battery 
might reveal some of the differences be- 
tween student groups as regards the abili- 
ties measured by the tests. 

Development of the test battery. Tests 
V, VI, and VII were translated into Nor- 
wegian without any changes. There was 
no reason to suppose that the translation 
would alter the nature or degree of diffi-
culty of these tests. On the other hand,
it was clear that in test I, part of tests II 
and III, and to a smaller degree test IV, 
the nature and difficulty of the test items 
were so dependent upon language that 
these factors would be altered in transla- 
tion. This necessitated the construction 
of some new items. As regards the relia-
bility of the tests, it was presumed that it
would not be the same for the transla- 
tion. However, since all the tests are fair- 
ly long, the reliabilities were not ex- 
pected to become so low that they would 
be unusable. 

An analysis of the results of the tryout 
of the battery showed that for the fourth 
grade in high school (which we consider 
equivalent to the American college fresh- 
man year) the tests were either too long 
or had too short time limits. The distribu- 
tion of results also in general showed a 
positive skewness. This skewness dimin- 
ished gradually for the student groups of 
more advanced education and a higher 
level of ability. The test battery that is 
to be used for counseling in high school 
will therefore be changed so that the posi- 
tive skewness disappears; the tests are to 
be shortened. For future use with univer- 
sity students, the test battery will be used 
as it is except for very small and unim- 
portant changes. 
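The positive skewness referred to above can be quantified with the usual moment coefficient, the third central moment divided by the 1.5 power of the second. The sketch below is only an illustration of that coefficient; the report does not say which index of skewness was used, and the scores shown are hypothetical.

```python
def skewness(scores):
    """Moment coefficient of skewness (population form, m3 / m2**1.5)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n
    m3 = sum((s - mean) ** 3 for s in scores) / n
    return m3 / m2 ** 1.5

# Hypothetical raw scores: most examinees low, a few high, as happens
# when a test is too long or too strictly timed for the group.
too_hard = [2, 3, 3, 4, 4, 4, 5, 6, 9, 14]
symmetric = [1, 2, 3, 4, 5]

print(skewness(too_hard) > 0)   # a long right tail gives a positive value
print(skewness(symmetric))      # a symmetric distribution gives zero
```

A value near zero after shortening the tests or lengthening the time limits would indicate that the positive skewness has disappeared for the group in question.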


The reliability coefficients of the seven
tests are given in Table 2.

TABLE 2
RELIABILITY COEFFICIENTS
(Yale Educational Aptitude Test Battery)

Test    Reliability
I          .94
II         .98
III        .93
IV         .96
V          .95
VI         .98
VII        .94

A number of these coefficients are considerably
higher than usual, especially the coefficients
for tests II and VI. The high reliability for
test VI is due mainly to subtest 16, which has
the greatest influence of the subtests on the
total score of test VI; its reliability is as high
as .9896 ± .002. In analyzing the results on
subtest 16, we found that almost all items which
had been attempted were correctly solved, and
that to the point where the time for the subtest
was used up every item was usually worked
on. This all points to the conclusion that a
speed factor had a decisive influence on the
results of subtest 16 and test VI. (In a new edition
of the test battery we are planning to change
subtest 16.) A speed factor usually has the effect
of raising the reliability coefficient of a test.
We also found that a speed factor had a great
influence in parts of test II.

For the tests and subtests which were supposed
to be most affected by translation, and
which were partly made up of new items, an
item analysis was carried out. This was not at
this stage done for tests V, VI, and VII, for
which the translation was not supposed to have
changed the difficulty of the items to any great
extent. Tests II and IV might have been changed
to a certain extent, but their reliabilities were
so high that we did not find it necessary to
make an item analysis.

Groups tested. Five different groups of
students were tested: 402 students in the
fourth grade of high school, 295 students
in the fifth grade of high school, 84
liberal arts university students, 118 pure
science and mathematics university students,
and 206 students from the Institute
of Technology in Trondheim. In high
school whole classes were tested. The
testing of the university students had to
be done on a voluntary basis. The only
criterion available to test their representativeness
of the student groups to
which they belonged was the average
grade for the high school matriculation
examination. Chi-square tests showed
that the tested university students in pure
science and mathematics were, on the
average, superior to the university group
to which they belonged. For the other
university groups the chi-square tests gave
no reason to reject the hypothesis that
the tested students were representative of
their different student groups. All the
university students had been studying for
from half a year to three years at the time
of testing. A group of 317 of the fourth-grade
high school students was used as a
norm group for the tests. Of these students,
155 were taking the language line
in high school, and 162 were taking the
science and mathematics line. The ratio
between these two numbers is the same as
that between the total numbers of students
following the different lines in the
same districts.

TABLE 3
INTERCORRELATIONS BETWEEN TESTS FOR
YALE FRESHMEN (N = 856)
(Yale Educational Aptitude Test Battery)

[Table body only partly legible in the source. Legible coefficients,
row and column positions uncertain: .19, .24, .23, .32, .38, .49, .50.]
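The representativeness check described under "Groups tested" can be sketched as a chi-square goodness-of-fit test: the tested volunteers are sorted into bands of matriculation grade and their frequencies are compared with those expected from the whole student group. The report does not give the grade bands or the counts, so the figures below are hypothetical, and the function name is ours.

```python
def chi_square_gof(observed, expected_props):
    """Chi-square statistic for observed counts against expected proportions."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: three bands of matriculation grade
# (low, middle, high). Proportions in the whole student group:
population_props = [0.30, 0.50, 0.20]
# Counts among 118 hypothetical tested volunteers:
tested_counts = [20, 58, 40]

statistic = chi_square_gof(tested_counts, population_props)
CRITICAL_05_DF2 = 5.991  # chi-square critical value, 2 df, 5 per cent level

print(round(statistic, 2), statistic > CRITICAL_05_DF2)
```

With these assumed counts the statistic exceeds the critical value, so the hypothesis of representativeness would be rejected; the surplus in the highest band would point to a tested group superior to its parent group, as was found for the science and mathematics students.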

Intercorrelations and factor analysis. 
The intercorrelations for the seven tests 
in the battery which have been found 
for Yale freshmen are given in Table 3. 
For comparison, the intercorrelations 
found for Norwegian high school pupils, 
fourth grade (which is about equivalent 
to the American college freshman), are 
given in Table 4. 

TABLE 4
INTERCORRELATIONS BETWEEN TESTS FOR
STUDENTS IN NORWEGIAN
HIGH SCHOOL, FOURTH GRADE
(N = 317)
(Yale Educational Aptitude Test Battery)

[Table body only partly legible in the source. Legible coefficients,
row and column positions uncertain: .48, .45, .62.]

We see from Tables 3 and 4 that (with
one exception) all the intercorrelations
found in Norway are higher than those
found at Yale. This is also true for the
intercorrelations found for Norwegian
university students in the humanities,
in science and mathematics, and for
technology students. It is possible that
this may be because Yale freshmen are
allowed, while in high school, more freedom
in choosing the subjects they take,
so that they arrive at a certain degree 
of specialization earlier than Norwegian 
high school and university students. An- 
other probable reason is the following: 
American students are accustomed to be-
ing tested all through their schooling;
they are quite accustomed to the way the 
test problems are stated, to the method 
of working, and to the answering tech- 
nique in tests like the Yale battery. The 
Norwegian high school and university 
students who took part in the tryout had 
very seldom or never experienced this 
sort of testing before, and the answer- 
ing technique was entirely new to all 
of them (i.e., using answer sheets designed 
for stencil scoring). They lacked the test-
taking ability which most American stu-
dents have acquired. This made the time
limits too short for Norwegian students.
It introduced a general “clerical speed
factor” into the battery.

On the basis of intercorrelations be- 
tween scores for fourth-grade high school 
students on all 20 subtests in the battery, 
a factor analysis of the test results was 
made. This was done mainly because it 
will be of great help in our future work 
with the battery to know as much as pos- 
sible about the different factors which 
operate in the test battery. We can also 
study more closely the nature of each in- 
dividual test, and possibly see whether 
or not any of the subtests ought to change 
place or be dropped. 

The correlation matrix was factored 
according to Thurstone’s complete cen- 
troid method of factoring. After the 
necessary rotations of the factor matrix, 
we ended with the final factor matrix 
given in Table 5. (The reference axes in 
the factor structure are not orthogonal.) 

TABLE 5
FACTOR MATRIX

$F_0 \Lambda_{0p} = V_p$
$a_{jm} \lambda_{mp} = v_{jp}$

Subtest      A        B        C        D
   1      [three entries legible: .083, .038, -.024]
   2       -.089     .672     .065     .021
   3       -.123     .710   [illeg.]   .015
   4        .014     .412     .576     .024
   5        .004     .448     .603    -.076
   6      [three entries legible: .004, .414, -.087]
   7      [three entries legible: .400, .009, .173]
   8       -.056   [illeg.]  -.052     .336
   9        .086     .432    -.153     .227
  10        .313     .002     .327     .254
  11        .454     .107     .065     .065
  12        .321     .062     .072     .386
  13        .341    -.164     .248     .338
  14        .393     .078     .182     .252
  15        .332     .064     .107     .277
  16      [three entries legible: -.026, .300, -.048]
  17      [three entries legible: .504, .117, .039]
  18        .516    -.062     .170     .040
  19      [three entries legible: .071, -.008, .052]
  20        .554     .047    -.123     .034
  Σ        4.783    4.729    3.015    2.168

By using the formula $R_{pq} = D(\Lambda'\Lambda)^{-1}D$,
where the diagonal matrix D is calculated
in such a way that the diagonal
values in matrix $R_{pq}$ = 1, we found the
correlations between the factors in
Table 5. These correlations are given in
Table 6.
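The computation of the factor intercorrelations can be sketched as follows, reading the formula as $R_{pq} = D(\Lambda'\Lambda)^{-1}D$, with $\Lambda$ the transformation matrix whose columns are unit reference vectors. This reading follows the usual Thurstone notation and is an assumption, since the printed formula is damaged. The report does not give $\Lambda$, so the 2 x 2 matrix below is hypothetical, and the helper functions are ours.

```python
from math import sqrt

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def inverse(m):
    """Gauss-Jordan inverse of a small non-singular square matrix."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def factor_correlations(lam):
    """R = D (lam' lam)^-1 D, with D chosen so that diag(R) = 1."""
    inv = inverse(matmul(transpose(lam), lam))
    d = [1.0 / sqrt(inv[i][i]) for i in range(len(inv))]
    return [[d[i] * inv[i][j] * d[j] for j in range(len(inv))]
            for i in range(len(inv))]

# Hypothetical transformation matrix: two unit-length reference vectors
# (the columns) whose cosine is +.60.
lam = [[1.0, 0.6],
       [0.0, 0.8]]
R = factor_correlations(lam)
print([[round(v, 3) for v in row] for row in R])
```

In this two-factor illustration the factors correlate -.60 although the reference vectors have a cosine of +.60, which shows why the factor intercorrelations of Table 6 have to be computed from the inverse and cannot be read directly from the reference axes.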

When factoring tests of the kind in 
question here, one cannot expect to get 
any clearly defined factors which can 
forthwith be given names derived from 
the content of the tests. The test results 
are to a high degree the result of, and 
modified by, acquired knowledge, pre- 
vious education, and special training in 
definite ways of formulating problems 
and in treating various sorts of material. 


TABLE 6
INTERCORRELATIONS OF FACTORS

Factor      A         B         C         D
A         1.000      .646      .050      .479
B          .646     1.000      .106    [illeg.]
C          .050      .106     1.000    [illeg.]
D          .479    [illeg.]  [illeg.]   1.000

Furthermore, the nature of the tests 
themselves is such that one can expect 
different factor analytical results from 
groups of testees who vary with regard 
to the degree and kind of their education. 
The results given here are therefore only 
valid for the group of students whose 
results have been used, namely Nor- 
wegian fourth-grade high school students. 
Specialized training has not yet got very 
far in the fourth grade of high school. 
The most manifest result of the factor- 
ing may be said to be that adequate 
grounds seem to exist for dividing the 
test battery into two main groups. The 
first group, which includes tests I, II, 
and III, may be described as a wide lin-
guistic-verbal group. The second, which
includes tests IV, V, VI, and VII, may
be described as a wide quantitative-
spatial group. It has not been possible
to discriminate any clearly spatial-
mechanical factor or corresponding
group of subtests from the quantitative- 
spatial group. There may be several rea- 
sons for this. It may be that the group 
of students which has had special train- 
ing in the numerical-mathematical field 
is also the group that has had the most 
training in the spatial field (geometry, 
trigonometry, projective drawing), and in 
the special kind of problems and method 
of solving them which are needed in 
tests VI and VII. The other students 
tested have had relatively little training 
in all these fields. The effect is that any 
quantitative-numerical field and any spa- 
tial field will tend to coincide in the 
factor analytical results. 

As to the composition of the subtests
in the battery, the results seem to indi-
cate that the subtests are, on the whole,
well chosen. Most of the subtests which
go together in a test lie close to each
other in most planes in the factorial
hyperspace, or are clearly separated from
the other subtests in one plane. The
only doubtful tests are the subtests in 
tests IV and V. They seem to cover the 
same field, and it is possible that it is 
not necessary to use them all. 

The same influences which affected the 
intercorrelations between the tests have 
also masked the results of the factor 
analysis, leading to the rather high inter- 
correlations between factors seen in 
Table 6.

Results with different student groups. 
The testing in high school showed that 
the test battery differentiated clearly be- 
tween the science and mathematics and 
the language lines even as early as the 
fourth grade, about a year and a half 
after the students had been separated 
into the different lines. This is clearly 
shown in Fig. 1. The difference between 
the two lines in high school is very pro- 
nounced and perhaps a little greater in 


" Vv v vi 


0 
4 4 4 4 
™S 
4 4 4 
4 4 
4 4 J 
3 
4 4 4 J 
> + + T 
wi 1 1 1 1 the 


———-—«sMean socres, Solence and Mathematics line, = 155. 
——— Mean scores, Language line. 162. 


Fic. 1, Mean test scores for science and mathe- 
matics line and for language line, in fourth 
grade of high school. (The Roman numerals 
refer to test I, test II, test III, etc. of the bat- 
tery.) 


%. SKARD, I. M. AURSAND, AND L. J. BRAATEN 


" Iv v vu fo 
J 4 4 4 
4 4 4 
70 + + + 
4 4 4 4 
4 4 4 4 4 4 
OP 
i 
4 2s 
4 4 
4 4 1 4 
4 4 4 4 4 » 
4 4 4 
} 4 4 4 4 4 4 
1 4 = 4 4 1 
» i L 4o 
—— sone 49.70 $8.23 5779 5761 5519 87.67 
4837 47.60 49.07 45.73 010 44.88 


Mean scores, Science and Mmthematics line. N= 145. 
—_— — Mean scores, Language line. N= 10. 


Fic. 2, Mean test scores for science and mathe- 
matics line and for language line, in fifth grade 
of high school. (The Roman numerals refer to 
test I, test I, test III, etc. of the battery.) 
the fifth grade, after another year of 
differentiated studies. This is shown in 
Fig. 2. 

An examination of Fig. 3 shows that 
the average profiles of the high school 
students of the language line, in the 
fourth and fifth grade, and of the liberal 
arts university students are similar. The 
high school students of the pure science 
and mathematics line, in the fourth and 
fifth grade, the pure science and mathe- 
matics university students, and the tech- 
nology students also have similar average 
profiles (see Fig. 4). The differences be-
tween the profiles of the groups shown
in Fig. 3 and in Fig. 4 are considerable
and statistically reliable for most of the
tests. The high school students of the
language line and the liberal arts uni- 
versity students have a lower average pro- 
file for all the tests than the student 
groups of corresponding educational level 


in the other group. This was the case 
even on the tests of verbal comprehen- 
sion, artificial language, and verbal reas- 
oning. As the profiles can be supposed 
to be partly the result of the nature of 
the studies of the different groups, this 
result is rather astonishing. The students 
of the language line and the liberal arts 
university students are supposed to get 
special training in verbal comprehension 
and linguistic ability. The difference be- 
tween the results for the liberal arts stu- 
dents and a more representative sample 
of pure science and mathematics students 
may be smaller than the differences found 
here. But as the differences on most of 
the tests are very large, there is reason 
to believe that most of the differences 
would be statistically reliable even in 
such a case. 


[Figure: profiles of mean scores on tests I to VII; vertical axis, standard score. Liberal arts university students, N = 84; language line, fifth grade of high school, N = 152; language line, fourth grade of high school, N = 162.]

Fig. 3. Mean test scores for liberal arts uni-
versity students, for language line students in
fifth grade of high school, and for language line
students in fourth grade of high school. (The
Roman numerals refer to test I, test II, test III,
etc. of the battery.)

[Figure: profiles of mean scores on tests I to VII. Technology students, N = 206; science and mathematics university students, N = 116; science and mathematics line, fifth grade of high school, N = 145; science and mathematics line, fourth grade of high school, N = 156.]

Fig. 4. Mean test scores for university tech-
nology students, university science and mathe-
matics students, and high school science and
mathematics line students (fourth grade and fifth
grade, respectively). (The Roman numerals refer
to test I, test II, test III, etc. of the battery.)

Results from Yale University show that 
prospective liberal arts freshmen do much 
better on the average than engineering 
students on the tests of verbal compre- 
hension and artificial language, while 
the engineering students are better on 
the other tests. There are probably many 
reasons why the Norwegian liberal arts 
students score below the other student 
groups on all tests. The main reason may 
be that in order to enter any of the 
many studies where admission is re-
stricted, you have to graduate from the
pure science and mathematics line in 
high school. Another possible reason is 
the rigidity of the division into lines in 
high school, and the fact that the lan- 
guage line has no subject which equals 
mathematics in difficulty. 

We cannot take these profiles as “ideal”
for the different fields of studies. For the
technology students admission to studies 
is restricted. In the case of these students 
the average profile will, of course, to a 
large extent be a result of the admission 
procedure. The value of the admission
procedure in forecasting results of study
will depend upon the actual prediction
value of the information used in selecting
students for admission. When selection
is not based on results of scientific
investigations, there is a possibility that 
the students will be selected on the basis 
of factors that are not essential for their 
studies. This may also happen to some 
extent in the case of curricula to which 
admission is free, because of the self- 
selection that is due to what the students 
think is needed, and on account of some 
social halo surrounding the vocations 
which the curricula lead to. 

[Figure 5. Plotted groups: first-year technology
students (N = 114); third-year technology
students (N = 92).]

FIG. 5. Mean test scores for first-year technology
students and for third-year technology students.
(The Roman numerals refer to test I, test II,
test III, etc. of the battery.)

The profiles can also be supposed to
be partly the result of the kind of
specialized education the group has had.
But there is reason to believe that 
further specialized education after the 
students have finished high school does 
not influence the average profiles of the 
groups very much. A comparison of two 
different groups of technology students, 
one having studied about half a year and 
another about two and a half years, 
showed that the profiles were very similar 
in spite of the greater amount of special- 
ized education of the latter group. This 
is shown in Fig. 5. 

The investigation in high school has 
shown that, as early as the fourth grade, 
there is undoubtedly a certain connection 
between the test results and the kind of 
studies leading up to the matriculation 
examination that the students take. Be- 
cause of the resemblance of the high 
school profiles to the profiles of the differ- 
ent university groups, there is also 
reason to suppose that there is a certain 
connection between a high school stu- 
dent’s test profile and his fitness for the 
different fields of academic study. 

Validation studies. The validation 
study of a test battery is never finished 
as long as the test battery is still used. 
So far the Norwegian edition of the 
Yale Educational Aptitude test battery 
has been used only experimentally in 
Norway. Its administration has not had 
any consequences for the students taking 
it. They were all informed before the 
testing that the aim of the testing pro- 
gram was not to give the students re- 
liable information about their abilities, 
but to gather data which would make it 
possible to evaluate the test battery as 
an instrument for vocational guidance 
and selection, and which would make it 
possible to refine the battery as a measure 
of academic aptitude. 

The fact that the testing was not “serious”
to the students may well have
influenced their motivation. As testing of
this kind is a fairly new and unfamiliar 
experience for most Norwegian students, 
we had whole groups of students who 
were openly hostile or deliberately in- 
different, some of them during the actual 
testing periods. All these motivational
factors worked in the direction of de- 
creasing the value of the results. 

In some of our validation studies our 
student groups were very small. 

For all these reasons the results pre- 
sented here are given rather hesitatingly, 
but we believe that in spite of all weak- 
nesses and the preliminary nature of our 
validation studies, the results may be of 
interest, and they may show us what to 
take into account in future development 
of the test battery. 

The liberal arts students who took the 
test battery have spread themselves over 
so many different fields of study that 
groups of students having grades in the 
same subject are very small. The groups 
we have been able to find so far are too 
small to calculate correlation coefficients. 
If we take all the students who have 
passed an examination in a foreign lan- 
guage (German, English, French) as one 
group, we get the largest group of liberal
arts students (N = 32) that can be
lumped together for validation purposes. 
We have divided this group into three 
subgroups: the group with the best 
grades, the group with medium grades, 
and the group with the lowest grades. Fig- 
ure 6 shows the average test results for 
these three groups, the tests having been 
administered one, two, or three years be- 
fore the students had passed their foreign 
language examination. 

[Figure 6. Vertical axis: standard score. Plotted
groups: group with best grades (N = 10); group
with medium grades (N = 12); group with low
grades (N = 10).]

FIG. 6. Mean test scores for three groups of
university students who had passed an examina-
tion in a modern foreign language: (a) group
with best grades, (b) group with medium grades,
and (c) group with low grades. (The Roman
numerals refer to test I, test II, test III, etc. of
the battery.)

The subgroups are, of course, very
small. Nevertheless it is evident that the
test results discriminate clearly between
the students with the best aptitude for
foreign languages, and the students with
medium and little aptitude for foreign
languages. Tests I and II seem to
discriminate better than the other tests.
This is to be expected because of the
nature of these tests.

For pure science and mathematics stu- 
dents and for technology students we 
have calculated several correlation co- 
efficients between test results and grades. 
The grades were received from one to 
three years after the testing. For the 
pure science and mathematics students 
we had to combine the grades in physics 
and chemistry in order to get a fairly big 
group (N= 71). This of course makes 
the correlations between the tests and this 
particular criterion less valuable. The 
correlations found for pure science and
mathematics and technology students so
far are given in Table 7.

TABLE 7

CORRELATIONS BETWEEN TEST SCORES AND GRADES OF NORWEGIAN STUDENTS

Group

Combined physics-chemistry
Technology students
  Mathematics
  Descriptive geometry
  Mechanics I
  Physics
  Average grade, first half of studies
  Individual work and paper (thesis) on special problem for final examination
  Average grade, second half of studies
  Final grade

At first sight these correlation coeffi- 
cients may seem rather discouraging; 
none of them is very high. We have 
already mentioned several reasons why 
very high correlations were not to be 
expected from the test results obtained 
so far. More important than the indivi- 
dual correlation coefficients at the present 
stage is Table 7 as a whole. Table 7 
demonstrates that none of the tests are 
tests of general scholastic aptitude in the 
university, and that success in a special 
field of study cannot be predicted equally 
well from all the tests in the battery. 
Test I shows correlations varying from 
+.36 to -.11; test II correlations vary
from +.34 to +.07; test III from +.36 to
-.04; etc. And if one were to predict
from the test battery the results in
mechanics for engineering students, for
instance, one would find that the
correlation between test I and mechanics is -.11,
while it is +.41 between test IV and 
mechanics. For this reason Table 7 is 
taken to indicate that the test battery has 
great possibilities as a differential apti- 
tude battery. 


Most of our results in Table 7 are 
with the technology students. These stu- 
dents were selected for their studies by a 
very strict selection procedure, and are 
only the very best students. Since this 
restriction in range will affect the distri- 
bution of the test scores and the reliabil- 
ity of the grades, it will also lower the 
correlation coefficients. It can be sup- 
posed that for a student group with a 
normal range of ability the correlation 
coefficients will be higher than those for
the technology students shown in Table 7.
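
The effect of restriction in range described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a reanalysis of our data: the full-group correlation of .60, the admission of the best fifth on the test score itself, and the function names are all hypothetical. The correction shown at the end is the standard formula for direct restriction on the predictor; it does not cover the lowered reliability of the grades.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_range_restriction(rho=0.6, n=20000, keep_fraction=0.2, seed=1):
    """Return (r in the full group, r among the admitted top scorers)."""
    rng = random.Random(seed)
    tests, grades = [], []
    for _ in range(n):
        test = rng.gauss(0.0, 1.0)
        # In the full group, grades correlate rho with the test (assumed).
        grade = rho * test + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        tests.append(test)
        grades.append(grade)
    r_full = pearson(tests, grades)
    # Hypothetical admission rule: keep only the best keep_fraction on the test.
    cutoff = sorted(tests)[int(n * (1.0 - keep_fraction))]
    admitted = [(t, g) for t, g in zip(tests, grades) if t >= cutoff]
    r_restricted = pearson([t for t, _ in admitted], [g for _, g in admitted])
    return r_full, r_restricted


def correct_for_restriction(r_restricted, sd_ratio):
    """Correct r for direct restriction on the predictor.

    sd_ratio is the unrestricted SD divided by the restricted SD.
    """
    rk = r_restricted * sd_ratio
    return rk / math.sqrt(1.0 - r_restricted ** 2 + rk ** 2)


if __name__ == "__main__":
    r_full, r_restricted = simulate_range_restriction()
    print(f"full group r = {r_full:.2f}, admitted group r = {r_restricted:.2f}")
```

Under these assumptions the correlation in the admitted group comes out clearly below the correlation in the full group, which is the direction of the effect supposed in the text.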


Revised editions of the test battery 
administered with more control of mo- 
tivational factors will better show the 
actual value of the battery for guidance 
and selection purposes. 

Comparison with results from the 
United States. The validation studies in 
the United States have given correlations 
between test results and grades of uni- 
versity students shown in Table 8. In 
comparing our results with those ob- 
tained in the United States, three main 
differences appear: 

TABLE 7 (continued)

N    Test I  II  III  IV  V  VI  VII

71 .10 -32 -42 -20
109 -08 .27 .10 -30 -08
109 -14 -41 -18 +32
99 +02 +21 -26 -19 +17
67 -36 -29 -35 -31 -27 -36 +20
65 -36 354 «23

TABLE 8

CORRELATIONS BETWEEN TEST SCORES AND GRADES OF UNIVERSITY
STUDENTS IN THE UNITED STATES

Freshman First-Term Course    N    Test I  II  III  IV  V  VI  VII

history and English
gra

“44

-46
-13

Spanish

Physics

31

-16
+25

Average of mathematics and
drawing grades
Engineering drawing

293 -Ig

-16
.18

-40

23

-40
-17

-42

-14

.07
12

.07

-40

45

51

.40

.24
.32

-49
43

.42
-46

+37

.24

sag
-37

.42

1. In the United States the liberal arts group
did better on the verbal tests than did the more
mathematical-technical group. This is not true 
in the Norwegian results, where the liberal arts 
group earned lower average scores than any 
other university student group. 

2. In the United States it has been reported 
that factor analysis of the test battery isolated 
three different factors; viz., verbal, mathematical, 
and spatial-mechanical. In the Norwegian results,
factor analysis failed to single out any
special spatial-mechanical factor. All the mathe- 
matical and spatial-mechanical kinds of tests 
form one big cluster in the factorial picture. 

3. The validation studies in the United States 
have given some correlations between test re- 
sults and grades that are higher than the cor- 
relations found thus far in Norway. If we com- 
pare the results in Tables 7 and 8, we see that 
the correlations found in Norway are usually 
lower and do not show quite the same degree 
of variation as do the results from the United 
States. 


Differences in the educational training 
of the student groups in the two coun- 
tries can be supposed to be a partial 
cause of the differing results. But other 
factors are probably more important 
causes. The intercorrelations between the 
tests were higher for all of the Norwe- 
gian student groups than for the Ameri- 
can group. The reliabilities of all the 
tests were greater in Norway than in the 
United States. Hardly any of the students 
in Norway were accustomed to this kind 
of testing and the use of special answer 


sheets. Hardly any of them completely 
finished a test. There is no reason to 
think that the Norwegian student on 
the average is a slower worker than the 
American student and that the different 
results in the two countries are due to 
some sort of inherited national character- 
istic of general speed of mental work. 
More likely they are due to a test-taking 
speed factor which has decisive influence 
in the testing situation for the Norwe- 
gian students. This test-taking speed 
factor is up to a certain point mainly 
determined by the amount of practice in 
taking tests. It has tended to increase the 
correlations between the tests. It is also 
evident in the high correlation of .65 
found between the verbal factor and the 
mathematical-mechanical factor. One of 
the results of the operation of this factor 
is that the chances for the liberal arts 
students to do better than the
mathematical-mechanical students on the
linguistic tests have decreased. Another
result is that the mathematical and spa- 
tial-mechanical tests, between which 
there may be supposed to be a fairly high 
correlation, have merged completely in 
the factorial picture. A third result is that 
the variation between the different
correlations found with university grades has
diminished, and the correlation
coefficients show a tendency to cluster around
a fairly low average correlation.
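
How a common test-taking speed factor can raise the correlation between two otherwise unrelated tests may be sketched as follows. The model is an assumption made only for illustration: each test score is taken as an independent ability component of unit variance plus the same speed component multiplied by a hypothetical weight `speed_weight`. Under that model the expected correlation between the two tests is w²/(1 + w²).

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def expected_intercorrelation(speed_weight):
    """Correlation implied by the assumed model: w^2 / (1 + w^2)."""
    w2 = speed_weight ** 2
    return w2 / (1.0 + w2)


def simulate_two_tests(speed_weight, n=5000, seed=7):
    """Simulate a verbal and a mathematical test sharing one speed component."""
    rng = random.Random(seed)
    verbal, mathematical = [], []
    for _ in range(n):
        speed = rng.gauss(0.0, 1.0)
        # The two ability components are independent of each other (assumed).
        verbal.append(rng.gauss(0.0, 1.0) + speed_weight * speed)
        mathematical.append(rng.gauss(0.0, 1.0) + speed_weight * speed)
    return pearson(verbal, mathematical)


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0, 1.5):
        print(f"speed weight {w:.1f}: simulated r = {simulate_two_tests(w):.2f}, "
              f"expected r = {expected_intercorrelation(w):.2f}")
```

The sketch only shows the direction of the effect; it says nothing about how large the speed component actually was in the Norwegian testing.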

In future work with this test battery in
Norway the influence of the speed factor
will be reduced. There is every reason to
believe that the test battery will yield 
better results when our work profits from 
our experiences thus far. 


III. DEVELOPMENT OF A STUDENT INTEREST SCHEDULE


In order to see what kind of interest 
schedules would be best suited for guid- 
ance work with Norwegian students, the 
Kuder, the Strong, and the Thurstone 
interest schedules were tried with groups 
of students. These tryouts seemed to in- 
dicate that the Thurstone schedule would 
be the best one for our purpose. This 
conclusion was not based on a refined 
statistical treatment of our results. It 
was based on three facts: 


1. A majority of the students reacted strongly 
against being required to indicate their prefer- 
ence and the activity they liked least in every 
item in the Kuder and the Strong schedules. 

2. A majority of the students reacted very 
strongly against answering questions over and 
over again in the Kuder and the Strong schedules. 

3. A majority of the students felt that too 
many of the questions in the Kuder and the 
Strong schedules were rather far removed from 
their field of interest.


The students seemed to like the Thur- 
stone schedule. As their reaction against 
the Kuder and the Strong schedules was 
rather extreme, we felt that it would not 
be advisable to continue the work with 
these two tests. 

The Norwegian edition of the Thur- 
stone schedule. Apart from some items 
which had to be replaced because of dif- 
ferences between Norwegian and Ameri- 
can occupations, the Norwegian edition 
of the Thurstone schedule was a transla- 
tion of the original. It was tried out on 
different groups of students, and reliabili- 
ties were calculated on the basis of data 
from 319 students. The corrected Pearson
correlation coefficients are shown in
Table 9, together with the reliabilities
reported by Thurstone.


These reliabilities are all very satis- 
factory. The musical interest scores, as 
revealed by inspection of the scatter dia- 
gram, fell into two clearly separated 
clusters, one group with very low scores 
on both row and column, and one group 
with medium and high scores on both 
row and column. This kind of scatter 
diagram will of course raise the reliability 
considerably; it is probably caused by 
the fact that some students regard them- 
selves as having no musical ability and 
for this reason never prefer a musical 
activity, while others are fairly musical 
and rather often prefer a musical kind of 
activity. It is exceptional that a student 
regards himself as having no ability in a 
field. This is usually only the case as 
regards musical ability. 
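
The way in which two clearly separated clusters raise a reliability coefficient can be shown with a small simulation. All numbers here are hypothetical: two clusters of 500 students with cluster means of 2 and 12 score points, and row and column scores that are unrelated within each cluster. The sketch is meant only to show the mechanism, not to reproduce the musical interest data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_two_clusters(n_per_cluster=500, low_mean=2.0, high_mean=12.0,
                          noise_sd=1.5, seed=3):
    """Correlate row and column scores within each cluster and pooled."""
    rng = random.Random(seed)
    clusters = {}
    for name, mean in (("low", low_mean), ("high", high_mean)):
        # Within a cluster the row and column scores share only the mean.
        rows = [mean + rng.gauss(0.0, noise_sd) for _ in range(n_per_cluster)]
        cols = [mean + rng.gauss(0.0, noise_sd) for _ in range(n_per_cluster)]
        clusters[name] = (rows, cols)
    all_rows = clusters["low"][0] + clusters["high"][0]
    all_cols = clusters["low"][1] + clusters["high"][1]
    return {
        "low": pearson(*clusters["low"]),
        "high": pearson(*clusters["high"]),
        "pooled": pearson(all_rows, all_cols),
    }


if __name__ == "__main__":
    for group, r in simulate_two_clusters().items():
        print(f"{group:>6}: r = {r:.2f}")
```

Within each cluster the correlation stays near zero, while the pooled correlation is high, because almost all of the pooled variance is the distance between the two clusters.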


TABLE 9

RELIABILITY COEFFICIENTS FOR THURSTONE
INTEREST SCHEDULE AND FOR NORWEGIAN
EDITION OF THE THURSTONE SCHEDULE

Thurstone Reliabilities    Norwegian Reliabilities

Interest Area

Physical science     .94
Biological science   .92
Computational
Business             .93
Executive            .92
Persuasive           .93
Linguistic           .90
Humanitarian
Artistic             .90
Musical              .96

.93
.95
.89
.94
.97


The average profiles for different 
groups of students showed very character- 
istic differences, and each profile had its 
highest score in the field of interest most 
closely related to the group’s field of 
study. But some profiles were flatter than 
would have been expected. From inter- 
views with the students several reasons 
for trying to develop a new interest 
schedule were discovered. All these 
reasons are connected with the fact that 
in the Thurstone schedule the students 
are to indicate their preference as regards 
vocations. In many cases the students like 
some activity which is part of a vocation, 
but they do not like all the activities of 
the job in question as a lifetime vocation. 
For this reason their liking of certain 
parts of a vocation did not influence their 
interest score. It was also discovered that 
in many cases the students did not know 
very much about the nature of vocations 
outside their own field of study. This 
might have the effect of raising the score 
in their own field of study in an artificial 
way. 

The Norwegian interest schedule. The 
main difference between the Thurstone 
schedule and the Norwegian schedule 
that was developed is that in the Nor- 
wegian schedule the items consist of pairs 
of activities instead of pairs of occupa- 
tions. These activities cannot be thought 
of as a lifetime occupation; they stretch 
over only a limited period of time. It 
was also supposed that the influence of 
social prestige and financial reward 
would be smaller when choosing one of 
two activities than when choosing one of 
two occupations. As an experiment, the 
Norwegian edition also included two
personality scores. They were meant to
indicate preference for working alone or for
working in a team. This was done with- 
out increasing the number of items in the 
schedule, by arranging 20 items or pairs
of activities in a parallelogram inside the
schedule so that one of the activities in 
each item was an activity which is carried 
out in isolation, and the other in a team. 
One activity to be carried out in isolation 
and one to be carried out in a team were 
thus compared with each other twice 
within each of the ten fields. The Nor- 
wegian interest schedule included the fol- 
lowing fields: 


A. Art and architecture
B. Technical
C. Science and mathematics
D. Medical
E. Agricultural
F. Sales
G. Business and social-economic
H. Law and political science
I. Language and literature
J. Humanitarian and social

1. Tendency to prefer working alone
2. Tendency to prefer working in a team

The musical score was dropped; the 
items in the other categories are adjusted 
to Norwegian fields of study and occu- 
pations. 

Reliabilities. The reliabilities were 
computed for a sample of 200 students 
from different fields of study. The Kuder- 
Richardson formula was used, except for 
the two personality scores (scores 1 and 
2), for which the split-half method was 
used. These reliabilities are given in 
Table 10. 

As the Kuder-Richardson method tends 
to give lower reliability coefficients than 
the split-half method, the reliabilities 
found were regarded as satisfactory ex- 
cept for scores 1 and 2. After further 
inspection of the results on scores 1 and 
2, they were dropped in the further work 
with the schedule. 
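
The two kinds of reliability estimate mentioned above can be sketched in a few lines. The assumptions should be stated plainly: the source does not say which Kuder-Richardson formula was used, so formula 20 is shown; items are treated as scored 0 or 1; the split-half estimate uses an odd-even split stepped up with the Spearman-Brown formula; and the function names and the small response matrices are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 items (one row per person)."""
    n_persons = len(responses)
    k = len(responses[0])
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n_persons
    # Variance of total scores, computed with n in the denominator.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_persons
    sum_pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in responses) / n_persons
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half(responses):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    odd = [sum(person[0::2]) for person in responses]
    even = [sum(person[1::2]) for person in responses]
    return spearman_brown(pearson(odd, even))


if __name__ == "__main__":
    # Hypothetical responses of six persons to four items.
    data = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    print(f"KR-20 = {kr20(data):.2f}, split-half = {split_half(data):.2f}")
```

The two estimates need not agree on the same data, since the split-half value depends on which items fall in which half.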

TABLE 10

RELIABILITY OF SCORES IN THE SEPARATE
INTEREST FIELDS OF THE NORWEGIAN
INTEREST SCHEDULE

Interest Field*   Reliability Coefficient   Interest Field*   Reliability Coefficient

A* . 89
B
81
)
E 51
F - 66

* For definition of the fields of interest see
text.

Since we were interested in the stability
of the interest scores over a long period
of time rather than in only one testing
situation, we asked about 200 students
to fill out the schedule a second time
after an interval of one and a half years.
The correlations between these two
results for five different groups of students
and for the whole group are given in
Table 11.


From this table it seems as if the stability of 
the scores is generally lowest for the group of 
psychology students. This may perhaps be ex- 
plained by the rather special nature of psy- 
chology. Psychology is related to many other 
branches of science and of professional work, 
and a psychology student often changes his in- 
terests within the field of psychology. He may 
begin with a special interest in psychometrics, 
then for a time be strongly attracted to clinical
psychology, and then after a time change over to 
social psychology. It is reasonable to suppose 
that a change of interest like this would affect
the scores on the interest schedule. It seems
unlikely that the same degree of change in in- 
terest would be possible within any other field 
of academic study. Another reason for the lack 
of stability of the interest scores of the psy- 
chology students may be that many psychology 
students begin their studies of psychology with 
a rather strong idealistic humanitarian-social 
motivation, without knowing much about the 
real nature of scientific psychology. After they 
have studied psychology for a time, however, 
their interest will be directed more toward
special scientific problems which they were not 
aware of earlier. This will also influence the 
stability of the interest scores. 

For all groups of students the interest 
areas which are specially related to their 
field of study and in which they make the 
highest average scores are also the areas 
in which they tend to show the lowest 
test-retest correlation. This is easily ex- 
plained by the fact that the highest 
average score for a group of students is 
usually made in the area for which the 
scores have the smallest degree of scatter, 
so that even a small difference in the raw 
scores at the two times of testing will 
affect the correlation coefficient
considerably. For this reason the test-retest
correlation coefficients for the total group of
students can be supposed to give a better
indication of the general stability of the
scores over a longer period of time than
the same coefficient for a special group of
students.

TABLE 11

RETEST CORRELATIONS FOR NORWEGIAN INTEREST SCHEDULE
(Time Interval Approximately 1½ Years)

Group of Students    Male  Female    Interest Field*  A* B C D E F G H I J

Psychology students
Medical students
Law and social-economics students
Language and literature students
Pure science and mathematics students
Total

-44 .72
-73 +74 57
-79
33 3 -75 +74 65 -74 +73 +57
22 12 .86 .70 -67 .53 .62
36 7 .7¢ .ss 66 597
161 53 -68 .80 .80 .78 .69 .69 mm -72 -73 «75

* For definition of the fields of interest see text.

TABLE 12

KUDER-RICHARDSON RELIABILITY COEFFICIENTS AND TEST-RETEST CORRELATIONS
OF NORWEGIAN INTEREST SCHEDULE SCORES

Measure                         Interest Field*

Kuder-Richardson reliability    .80 .88 .87 .88    .86 .87 .89 .85
Test-retest correlation         .68 .80 .80 .78    .69 .79 .73 .75
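
The argument that a small degree of scatter lowers the test-retest correlation can be made concrete with the usual true-score model. This is a sketch under assumptions: each score is taken as a stable component plus an independent error at each testing, with the same error variance at both times, and the standard deviations below are hypothetical values, not estimates from our groups.

```python
def expected_retest_r(true_sd, error_sd):
    """Test-retest correlation implied by the assumed true-score model.

    r = true variance / (true variance + error variance)
    """
    true_var = true_sd ** 2
    return true_var / (true_var + error_sd ** 2)


if __name__ == "__main__":
    error_sd = 2.0  # hypothetical change in raw score between testings
    for true_sd in (1.0, 2.0, 3.0, 4.0):
        r = expected_retest_r(true_sd, error_sd)
        print(f"scatter (SD) {true_sd:.1f}: expected retest r = {r:.2f}")
```

With the same amount of change between the two testings, a group with little scatter gives a lower coefficient than a group with wide scatter, which is why the coefficient for the total group is the better indication of stability.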

The retest correlation coefficients found 
here probably also suffer from the limita- 
tion caused by the special scatter of the 
total student group. Several student 
groups are not included (for instance, 
students of agriculture, students of archi- 
tecture). With these groups included, the 
scatter of the scores would probably have 
been different. 

A comparison of the reliability co- 
efficients found by the Kuder-Richardson 
method on the basis of one testing and by 
the test-retest method for the different 
interest areas is given in Table 12. This 
table justifies the statement that the test- 
retest correlations in the present case are 
rather high. 

Intercorrelations and factor composition.
The correlations among the ten
different interest scores are given in
Table 13.

Table 13 shows that all correlations
above +.30 are between occupational
fields which are related to each other and
that negative correlations of any
importance are between occupational fields
which are usually considered not to be
related to each other. Most of these
negative correlations are between
mathematical and technical fields on the one hand,
and liberal arts fields on the other. The
correlation matrix in Table 13 was
factored according to Thurstone’s complete
centroid method. After three different
rotations the result was the final factor
matrix given in Table 14.

TABLE 13

INTERCORRELATIONS AMONG SCORES ON THE SEPARATE INTEREST FIELDS
OF THE NORWEGIAN INTEREST SCHEDULE*

Field  A  B  C  D  E  F  G  H  I  J

A —.059 —.027 —.005 025 —.005 —.034 169 325
B -623 -096 -264 -054 .076 —.291 —.335 —.336
C +377 -334 —-099 030 —.1§7 —.316 —.163
D 272 —.054 —.222 —.016
E 128 203 .063 —.013 .142
F -742 +232 —.055
G +352
H -636
I
J

* For definition of the fields of interest see text.

TABLE 14

FACTOR LOADINGS FOR NORWEGIAN INTEREST SCHEDULE

Interest Field*    Factor I  II  III  IV    Communality

IV

-29
-O2
.O7
-22
-22
.50
-17
-69

* For definition of the fields of interest see text.

On the basis of these results and of
an analysis of the items in the interest
areas with the highest loadings on the
different factors, the following
conclusions were reached:

Factor II, with scores B, C, D, and E 
having the highest factor loadings, was 
called interest in natural science. 

Factor III, with scores F, G, and H 
having the highest factor loadings, was
difficult to name, but was described as 
a factor for interest in sales fields, ad- 
ministration, and industrial finance. 

Factor I, with scores H and J having 
the highest factor loadings, was called
interest in social welfare.

Factor IV, with scores H, I, and J 
having the highest factor loadings, was 
described as an interest factor for phi- 
lology, combining the interests in lan- 
guage and literature. 

These results check reasonably well 
with Thurstone’s study, in which he
found four interest factors: interest for 
science, people, language, and business. 
Our factor for interest in natural science 
corresponds to Thurstone’s factor for 
interest in science; our interest factor for 
philology corresponds to his factor for 
language; our factor for social welfare 
corresponds to his factor for interest in 
people; and our interest factor for sales 
fields, administration, and industrial
finance corresponds to Thurstone’s factor
for business interests.
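
For readers unfamiliar with the centroid method, the first step of the extraction can be sketched briefly. This is not the complete procedure used for Table 14: only the first centroid factor and the residual matrix are computed, the sign reflections and the rotations are left out, and the small correlation matrix is hypothetical. The last function shows how a communality follows from a row of loadings. In the usage lines, the loadings -.04, -.04, and .83 for field I are read from Table 14; taking .17 as the fourth loading is an assumption, adopted because the four squares then sum to the printed communality of .721.

```python
import math


def first_centroid_factor(r):
    """First centroid loadings: column sums divided by sqrt of the grand sum.

    r is a square correlation matrix (list of lists) with communality
    estimates on the diagonal; the grand sum must be positive.
    """
    n = len(r)
    column_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    total = sum(column_sums)
    return [s / math.sqrt(total) for s in column_sums]


def residual_matrix(r, loadings):
    """Correlations left over after removing one factor."""
    n = len(r)
    return [[r[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


def communality(loadings_row):
    """Sum of squared loadings of one variable over all factors."""
    return sum(a * a for a in loadings_row)


if __name__ == "__main__":
    # Hypothetical 3 x 3 matrix; the diagonal holds communality estimates.
    r = [
        [0.60, 0.60, 0.30],
        [0.60, 0.60, 0.40],
        [0.30, 0.40, 0.40],
    ]
    loadings = first_centroid_factor(r)
    print("first centroid loadings:", [round(a, 3) for a in loadings])
    print("residuals:", [[round(v, 3) for v in row]
                         for row in residual_matrix(r, loadings)])
    # Field I of Table 14; the fourth loading (.17) is an assumed reading.
    print("communality, field I:", round(communality([-0.04, -0.04, 0.83, 0.17]), 3))
```

Further factors would be taken from the residual matrix in the same way after reflecting the signs of some variables, a step omitted here.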

Group profiles and prediction value. 
The average interest profiles for different 
groups of students were found in order 
to see whether it is justifiable to interpret 
the individual profiles in a meaningful 
way. Of course these profiles will be 
partly the result of the influence of the 
fact that these students already had 
chosen their field of study and of the 
length of time they had been studying. 
Nevertheless, very typical group
profiles may be taken as support for the
self-interpreting value of the interest
schedule.

The mean value and the standard de- 
viation of the ten scores for different 
student groups are shown in Fig. 7 to 
13 (pp. 20-21) and in Table 15. 

It will be seen that each student group 
has the highest score in the interest field 
most closely related to its special field of 
study. The medical student group pro- 
vides the most typical result with a score 
in area “D” (medical fields) of 15.93 and 
a very small degree of scatter. The bus- 
iness and social-economics students show 
a similarly typical result. Scores have also 
been computed for groups of students not 
included in the present report, and they 
all show very typical results. 

Though nothing definite has been
proved about the prediction value of
the schedule, the results given here are
regarded as strong support for the view
that the schedule, when used with care,
will be of considerable value in student
counseling.

TABLE 14 (continued)

At -05 -157
- .76 —.16 —.32 - 706
D .50 — .32 .28 -436
E -54 .08 .00 304
F ~ .14 .05 -618
H .48 .674
I — .04 —.04 .83 .721
J -20 -§2 -799

TABLE 15

MEAN SCORES OF DIFFERENT STUDENT GROUPS IN THE SEPARATE FIELDS
OF THE NORWEGIAN INTEREST SCHEDULE

Columns, in order: Engineering Students (N=104); Science and
Mathematics Students (N=58); Medical Students (N=82); Business and
Social-Economics Students (N=116); Law Students (N=50); Language and
Literature Students (N=90); Psychology Students (N=66)

Interest
Field*  Measure
A   M   8.31 7.52 7.96 8.26 0-74 8.92
A   SD  4-01 3.36 a. 3-79 3.85 4-25 3.88
B   M   Ts, 1.41 6.98 7.11 5-54 5-52 5.94
B   SD  4- 4.87 4-40 4-74 4°43 4-29 3-99
C   M   8. 3-38 5-93 5-59 4.06 5.12 5.70
C   SD  4- 4-31 3.83 3-25 3-86 4-45 3-85
D   M   5- 0.43 15-93 5-96 5-52 6.64 11.44
D   SD  4- 5.68 2.32 3-74 3.89 4.38 3-97
E   M   8.34 6.96 3.96 6.90
E   SD  4: 5-23 4-34 4-57 3.61 5.10
F   M   3- 2.12 2. 10.21 6.32 3-74 4.00
F   SD  3: 2.78 2.42 4-13 4-73 3-48 3-95
G   M   7. 4.00 4-41 16.88 11.06 7-20 8.15
G   SD  4- 2.70 3.90 2.39 4-40 3.78 4-29
H   M   He 7.38 10.07 11.93 16.48 12.41 13.67
H   SD  4- 3-29 4 3.56 3-3" 4.52 2.90
I   M   8. 9-55 10.28 9.91 11.24 16. 36 14.21
I   SD  3. 3-92 4.04 4.36 4.27 3-25 3.20
J   M   6. 7.04 11.04 9.8: Q.22 11.47 14.62
J   SD  4: 5: 4- 4- 4.87 4.72 2.73

* For definition of the fields of interest see text.


IV. SUMMARY


When the research project reported
here was started, there existed no tests of
aptitudes or interests for university
students in Norway. The most important
single study in the present project was
that of adapting the Yale Educational
Aptitude test battery for use in Norway.
But the report also covers work with an
adaptation of the American Council on
Education Psychological Examination for
College Freshmen, and with several
interest schedules.


Experiments with a General Intelligence Test


A translation of the American Council
on Education Psychological
Examination for College Freshmen was
employed. As regards the verbal part of 
this test, it was necessary to change this 
considerably by the substitution of new 
items better suited for use with Norwe- 
gian students. The Norwegian form of 
the test was tried out in all the usual 
ways and revised according to the results 
obtained. The final form of the test 
yields a satisfactory progression of mean 
scores for groups known to differ in in- 
tellectual ability. Validity coefficients for 
the test (correlations between total score 
on the test and final grades) range from 
18 to .71, with an average of about .50. 


@. SKARD, I. M. AURSAND, AND L. J. BRAATEN 


DEFGHIJ 


- a 


Engineering students 
One standard deviation above and 
below the mean 
Fic. 7. Scores of engineering students on the 


Norwegian interest schedule. Letters on the X 
axis designate interest areas (see text). 


ABCDEFGHIJ Medical students 


One standard deviation above and 
below the mean 


Fic. g. Scores of medical students on the Nor- 
wegian interest schedule. Letters on the X axis 
designate interest areas (see text). 


Science and Math. students 


One standard deviation above and 
below the mean 


Fic. 8. Scores of university science and mathe- 
matics students on the Norwegian interest 
schedule. Letters on the X axis designate interest 
areas (see text). 


20 
AB 
20 
18 
ABCDEFGHtIS 
16 
Wr, 
FS YY 
0 


Fig. 10. Scores of business and social-economics students on the
Norwegian interest schedule. Letters on the X axis designate interest
areas (see text). Legend: business and social-economics students; one
standard deviation above and below the mean.

Fig. 11. Scores of law students on the Norwegian interest schedule.
Letters on the X axis designate interest areas (see text). Legend: law
students; one standard deviation above and below the mean.

Fig. 12. Scores of language and literature students on the Norwegian
interest schedule. Letters on the X axis designate interest areas (see
text). Legend: language and literature students; one standard
deviation above and below the mean.

Fig. 13. Scores of psychology students on the Norwegian interest
schedule. Letters on the X axis designate interest areas (see text).
Legend: psychology students; one standard deviation above and below
the mean.


This is much higher than the correlation
found, in comparable groups in Norway, 
between high school grades and grades in 
university work. 


Experiments with the Yale Educational Aptitude Test

A Norwegian edition of the Yale 
Educational Aptitude Test was tried 
out on different groups of students in 
1949 and 1950. The Norwegian edition 
consisted of a translation into Norwegian 
of tests V, VI, and VII of the Yale test 
without any change of items; tests I, II, 
III, and IV, which depend on language
to a greater extent than the others, were
translated with the substitution of some
new or modified items. A tryout of the
battery with students in the fourth grade 
of high school (considered the equivalent 
of the college freshman year in the 
United States) showed that the component
tests of the battery had adequate
reliability, from .93 to .98; these high
coefficients, however, were due in part
to a speed factor. 

Beyond the initial tryout, five different 
groups of students were tested: 402 stu- 
dents in the fourth grade of high school, 
2g, students in the fifth grade of high 
school, 84 liberal arts university students, 
118 pure science and mathematics uni- 
versity students, and 206 students from 
the Institute of Technology in Trond- 
heim. All the university students had 
been studying from half a year to three 
years at the time of testing. 

a. Intercorrelations. The intercorrelations among the seven tests of
the Yale battery (N = 317 students in the fourth grade of high school
in Norway) range from .45 to .77; with one exception, the
intercorrelations are all higher than those reported for college
freshmen in the United States. The higher intercorrelations in the
Norwegian group may reflect a “clerical speed factor,” since the
Norwegian students were comparatively unfamiliar with objective tests
and answer sheets. The wider use of “elective” subjects in American
education, and the consequent earlier and greater specialization among
the United States students, may also help account for the lower
intercorrelations in the American results.
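
As an illustration of how a range of intercorrelations such as the one quoted above is obtained, the following minimal sketch computes Pearson correlations for every pair of tests and reports the lowest and highest. The score lists are invented for the example and are not data from the study; the function names are likewise only illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def intercorrelations(scores):
    """Correlate every pair of tests; `scores` maps test name -> list of scores
    for the same students in the same order."""
    names = list(scores)
    return {
        (a, b): pearson(scores[a], scores[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical scores of six students on three tests (not the study's data).
scores = {
    "I": [12, 15, 9, 20, 17, 11],
    "II": [10, 14, 8, 19, 15, 12],
    "III": [7, 9, 6, 12, 13, 5],
}
r = intercorrelations(scores)
lowest, highest = min(r.values()), max(r.values())
print(f"intercorrelations range from {lowest:.2f} to {highest:.2f}")
```

The same `pearson` function applied to a test score and a grade average gives a validity coefficient of the kind reported in this summary.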

b. Factor analysis. On the basis of the 
intercorrelations for fourth-grade high 
school students taking all 20 subtests 
in the battery, a factor analysis (by 
Thurstone’s complete centroid method, 
followed by necessary rotations) was per- 
formed. The results show that the sub- 
tests can be classified into two main 
groups: (a) a wide linguistic-verbal 
group, and (b) a wide quantitative-spatial 
group. It has not been possible to dis- 
criminate any clearly spatial-mechanical 
factor or corresponding group of subtests 
from the quantitative-spatial group. 
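
The centroid method named above can be sketched roughly as follows. This is only an illustration of extracting the first centroid factor from a correlation matrix, under two stated assumptions: the diagonal is filled with the highest correlation in each row as a communality estimate (a common convention, not confirmed by the text), and all correlations are positive so that no sign reflection is needed. Later factors and the rotations mentioned in the text are not shown, and the matrix is invented, not taken from the study.

```python
from math import sqrt


def with_communality_estimates(R):
    """Copy R and replace each diagonal entry by the largest absolute
    off-diagonal correlation in that row (assumed convention)."""
    k = len(R)
    out = [row[:] for row in R]
    for i in range(k):
        out[i][i] = max(abs(R[i][j]) for j in range(k) if j != i)
    return out


def first_centroid_factor(R):
    """Return (loadings, residual) for the first centroid factor.

    Each loading is the column sum divided by the square root of the
    grand total of the matrix; the residual matrix is R minus the
    outer product of the loadings.
    """
    k = len(R)
    col_sums = [sum(R[i][j] for i in range(k)) for j in range(k)]
    total = sum(col_sums)
    loadings = [s / sqrt(total) for s in col_sums]
    residual = [
        [R[i][j] - loadings[i] * loadings[j] for j in range(k)]
        for i in range(k)
    ]
    return loadings, residual


# Hypothetical correlations among four subtests (not the study's data).
R = [
    [1.00, 0.60, 0.50, 0.20],
    [0.60, 1.00, 0.55, 0.25],
    [0.50, 0.55, 1.00, 0.30],
    [0.20, 0.25, 0.30, 1.00],
]
loadings, residual = first_centroid_factor(with_communality_estimates(R))
print([round(a, 2) for a in loadings])
```

A useful check on the arithmetic is that every column of the residual matrix sums to zero after the first factor has been removed; further factors are then extracted from the residuals after reflecting signs.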

c. Profiles. Students in different curri- 
cular groups differ in the manner gener- 
ally to be expected, the language line 
students making relatively higher scores 
in the linguistic-verbal group of tests, 
and the science and mathematics line 
students making comparatively higher 
scores in the quantitative-spatial group of 
tests. However, the profiles of the lan- 
guage line high school students and of 
the university liberal arts students are 
lower than those of the pure science and 
mathematics line (i.e., the average scores 
of the former groups are in general 
lower), a fact due at least in part to selec- 
tive factors in the choice of, or admission 
to, the different curricula (the standards 
for the pure science and mathematics line 
being higher). 

d. Validation studies. Correlations be- 
tween the seven tests of the Yale battery 




and university grades, for a group of pure 
science and mathematics students and 
for technology students, range from —.11 
to .42. The rather low level of these cor- 
relations is probably due to (a) the high 
degree of selection in the group; (b) the 
presence of an excessive “clerical speed” 
factor; and (c) lack of motivation among 
the students taking the test. In future 
work with this test battery in Norway, 
the influence of the speed factor will be 
reduced; this, together with better con- 
trolled motivation of the students, should 
lead to improved validity coefficients. 


Experiments with Interest Schedules 

Tryouts of the Kuder, Strong, and 
Thurstone interest schedules led to un- 
satisfactory or only qualifiedly satisfac- 
tory results. The Kuder and Strong 
tests were discontinued early, because a 
majority of the students reacted strongly 
against being required to indicate some 
preference when, in fact, their preference 
might be quite mild or nonexistent. The 
Thurstone schedule proved acceptable 
to the students, and yielded high reliabil- 
ity coefficients, but some of the profiles of 
interest scores for different groups were 
flatter than would have been expected. 
Accordingly, a revision of the Thurstone
schedule was prepared. The main difference
between the Thurstone schedule
and the Norwegian schedule is that in 
the Norwegian schedule the items consist 
of pairs of activities instead of pairs of 
occupations. The Norwegian interest 
schedule includes the following fields: 


A. Art and architecture
B. Technical
C. Science and mathematics
D. Medical
E. Agricultural
F. Sales
G. Business and social economic
H. Law and political science
I. Language and literature
J. Humanitarian and social



a. Reliability. Kuder-Richardson reliability coefficients of scores in
the 10 different interest areas ranged from .80 to .89 (N = 220
students from different fields of study). Retest coefficients (with an
interval of approximately one and a half years) ranged from .68 to .80
(N = 214 students).
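
For readers unfamiliar with the Kuder-Richardson coefficient, the sketch below computes it for one interest area, assuming formula 20 was the one used (the text does not say which Kuder-Richardson formula was applied) and assuming each item is scored 1 when the activity belonging to the area is preferred and 0 otherwise. The response matrix is invented for the example.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item scores.

    `responses` is a list of rows, one per student, each row holding
    that student's 0/1 scores on the k items of one scale.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Variance of total scores, using n in the denominator.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum over items of p * q, where p is the proportion scoring 1.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of six students to four items (not the study's data).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]
print(round(kr20(responses), 2))
```

The retest coefficients, by contrast, are ordinary correlations between scores on the first and second administrations.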

b. Intercorrelations and factor com- 
position. Intercorrelations among the 
scores in the different interest areas range 
from —.336 to .636. All correlations 
above .g0 are between occupational fields 
which are related to each other. Most of
the negative correlations are between 
mathematical and technical fields on the 
one hand, and liberal arts fields on the 
other. Factor analysis by Thurstone’s 
complete centroid method, followed by 
three rotations, yielded four factors, as 
follows: factor I, interest in social wel- 
fare; factor II, interest in natural science; 
factor III, interest in sales fields, admin- 
istration, and industrial finance; factor 
IV, interest in language and literature. 
These correspond reasonably well with 
Thurstone’s four interest factors (interest 
in people, science, business, and lan- 
guage). 

c. Profiles. Group profiles for engineer- 
ing students, science and mathematics 
students, medical students, business and 
social-economics students, law students, 
language and literature students, and 
psychology students show that each stu- 
dent group makes its highest score in the 
interest area most closely related to its 
special field of study. Of course, these 
profiles may be partly the result of, rather 
than a motivational factor leading to, 
the choice of the particular field of study. 
On the whole, however, the profiles pro- 
vide some assurance of the value of the 
interest schedule for purposes of student 
counseling. 
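
A group profile of the kind plotted in Figs. 7 to 13 can be sketched as the mean score in each interest area together with a band of one standard deviation on either side. The example below uses invented scores for three students and assumes the sample standard deviation; the text does not say which form of the standard deviation was plotted.

```python
from statistics import mean, stdev

AREAS = "ABCDEFGHIJ"  # the ten interest areas of the Norwegian schedule


def group_profile(student_scores):
    """Return {area: (mean - sd, mean, mean + sd)} for a group.

    `student_scores` is a list of rows, one per student, each holding
    ten scores in the order A through J.
    """
    profile = {}
    for idx, area in enumerate(AREAS):
        values = [row[idx] for row in student_scores]
        m, sd = mean(values), stdev(values)
        profile[area] = (m - sd, m, m + sd)
    return profile


def peak_area(profile):
    """Interest area with the highest group mean."""
    return max(profile, key=lambda area: profile[area][1])


# Hypothetical scores for three engineering students (not the study's data).
engineering = [
    [5, 18, 14, 6, 4, 3, 7, 5, 6, 8],
    [7, 16, 15, 5, 6, 4, 8, 6, 5, 9],
    [6, 17, 12, 7, 5, 5, 6, 7, 7, 7],
]
profile = group_profile(engineering)
print(peak_area(profile), profile[peak_area(profile)])
```

With these invented scores the peak falls in area B (Technical), which mirrors the pattern the summary describes: each group scores highest in the area closest to its own field.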






(Accepted for early publication August 26, 1954) 




